TSQL User-Defined Functions: Ten Questions You Were Too Shy To Ask

The Questions

“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”
“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”
“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”
“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”
“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”
“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that lets me use the GETDATE function?”
“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”
“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”
“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”
“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”

Yes. In fact, doing so is actually a very straightforward process, at least for scalar functions. You simply include the RETURNS NULL ON NULL INPUT option in your WITH clause, as shown in the following example (option highlighted):

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH RETURNS NULL ON NULL INPUT,

SCHEMABINDING AS

BEGIN

DECLARE @TotalItems INT

SELECT @TotalItems = SUM(OrderQty)

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

RETURN @TotalItems;

END;

By default, when you call a scalar function, the database engine executes the function body whether or not a NULL is passed in as a parameter value. However, by including the RETURNS NULL ON NULL INPUT option, the database engine will not execute the function body when a NULL value is passed in. For example, the following SELECT statement returns a NULL value:

1	SELECT dbo.fnGetTotalItems(null);

When the database engine sees the NULL input value, it simply returns NULL without processing the function body. If your function supports multiple input parameters, the database engine returns NULL if NULL is passed into any one of those parameters and does not execute the function body.

However, you cannot use the RETURNS NULL ON NULL INPUT option for a table-valued function. Because a table-value function returns a full resultset (table), it is possible for the function to return data even if a parameter value is NULL. As such, the option cannot be practically applied to a table-valued function.

“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”

There is as long as you’re using SQL Server 2005 or later. Starting with SQL Server 2005, you’ve been able to use the APPLY operator to join one or more tables to a table-valued function in order to invoke that function against each row in the resultset. Prior to SQL Server 2005, you had to come up with a complex workaround to achieve this.

The best way to understand how the operator works is to look at an example. The following T-SQL creates a table-valued function that returns the total number of items sold for each sale listed in the SalesOrderDetail table of the AdventureWorks2012 database:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT SUM(OrderQty) AS TotalItems

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

);

The function takes as an argument the SalesOrderID value that identifies the sale. In this case, the function returns only a single row with one column, but you can just as easily create a function that returns multiple rows. The simplest way to verify that the function is working as you expect is to run a simple SELECT statement that calls the function and passes in a SalesOrderID value:

1	SELECT TotalItems FROM dbo.ifGetTotalItems(43659);

For this example, we pass in 43659 the parameter value. The function then returns 26. However, all we’ve so far is to demonstrate that the function works as we expect. Let’s create a more complex SELECT statement that applies the function to each row returned from the SalesOrderHeader table. That’s where the APPLY operator comes in, as shown in the following example:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

f.TotalItems

FROM Sales.SalesOrderHeader s

CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

Notice that after we specify the SalesOrderHeader table in the FROM clause, we then include the CROSS APPLY keywords, following by the function, with the SalesOrderID column passed in as the parameter value.

The APPLY operator takes two forms: CROSS APPLY and OUTER APPLY. The CROSS APPLY combination returns rows from the primary table (in this case, SalesOrderHeader) only if they produce a result set from the table-valued function. The OUTER APPLY combination returns all rows from the primary table. In this case, both forms of APPLY return the same number of rows. The following table shows a partial list of the results returned by this query:

SalesOrderID	OrderDate	SalesPersonID	TotalItems
43659	2005-07-01 00:00:00.000	279	26
43660	2005-07-01 00:00:00.000	279	2
43661	2005-07-01 00:00:00.000	282	38
43662	2005-07-01 00:00:00.000	282	54
43663	2005-07-01 00:00:00.000	276	1
43664	2005-07-01 00:00:00.000	280	14
43665	2005-07-01 00:00:00.000	283	20
43666	2005-07-01 00:00:00.000	276	7
43667	2005-07-01 00:00:00.000	277	6
43668	2005-07-01 00:00:00.000	282	93
43669	2005-07-01 00:00:00.000	283	1
43670	2005-07-01 00:00:00.000	275	6
43671	2005-07-01 00:00:00.000	283	17
43672	2005-07-01 00:00:00.000	282	9
43673	2005-07-01 00:00:00.000	275	20

As you can see, the results include the TotalItems column, which is the number of items associated with that sale. In other words, we were able to apply the table-valued function to each row returned from the SalesOrderHeader table. If our function had returned multiple rows, the result set would have included that number of rows for each row returned by the function. For example, if the function always returned three rows for each SalesOrderID value, our resultset would include three times the number of rows than it currently does.

“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”

Using a function to parameterize a view has little to do with the view definition itself. It merely means you’re creating a function that either duplicates the logic of the view or calls the view within the function. In either case, you use a parameter to qualify the function’s SELECT statement.

For example, suppose we create the following view to retrieve data about the number of employees per job title:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.JobData', N'V') IS NOT NULL

DROP VIEW dbo.JobData;

CREATE VIEW dbo.JobData

WITH SCHEMABINDING

SELECT JobTitle, COUNT(*) AS TotalEmps

FROM HumanResources.Employee

GROUP BY JobTitle;

As you can see, the view is very straightforward. The SELECT statement groups the data by the JobTitle column and retrieves a count for each group. You can, of course, create a view that is far more complex than this one, but what we’ve done here is enough to demonstrate how this all works.

Once you’ve created the view, you can test it by running a simple SELECT statement, similar to the following:

1	SELECT * FROM dbo.JobData;

Not surprisingly, the statement returns all rows and columns returned by the view. The following table provides a partial list of those results.

JobTitle	TotalEmps
Accountant	2
Accounts Manager	1
Accounts Payable Specialist	2
Accounts Receivable Specialist	3
Application Specialist	4
Assistant to the Chief Financial Officer	1
Benefits Specialist	1
Buyer	9
Chief Executive Officer	1
Chief Financial Officer	1
Control Specialist	2
Database Administrator	2
Design Engineer	3
Document Control Assistant	2
Document Control Manager	1

As with any SELECT statement that retrieves data from a view, we can further refine our SELECT statement by including the logic necessary to return the results we need, as shown in the following example:

1 2	SELECT TotalEmps FROM dbo.JobData WHERE JobTitle = 'Buyer';

In this case, we’ve merely specified a column in the SELECT list and added a WHERE clause that limits the results to those rows in which the JobTitle value equals Buyer. Now the statement returns only a value of 9 because that’s how many employees have that title.

Rather than creating a view and then qualifying the SELECT statements that call the view, we can instead create a table-valued function that incorporates the view’s logic, but also provides the ability to qualify the results through a parameter, as shown in the following example:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25))

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT COUNT(*) AS TotalEmps

FROM HumanResources.Employee

WHERE JobTitle = @title

GROUP BY JobTitle

);

Notice that the function’s SELECT statement is similar to that of the view’s except that we also include a WHERE clause that compares the JobTitle column to the @title input parameter. When you call the function, you simply pass in the job title as an argument:

1	SELECT * FROM dbo.GetJobData('Buyer');

As to be expected, the SELECT statement returns a value of 9. Chances are, however, you’ll want to incorporate the function in a more complex query. In the following example, we use the APPLY operator to join the Person and Employee tables to the GetJobData function:

SELECT

p.FirstName + ' ' + p.LastName AS FullName,

e.JobTitle,

f.TotalEmps

FROM

HumanResources.Employee e

INNER JOIN Person.Person p

ON e.BusinessEntityID = p.BusinessEntityID

CROSS APPLY dbo.GetJobData(e.JobTitle) f;

For each row returned from the joined Person and Employee tables, the GetJobData function is applied to that row, based on the value of the JobTitle column. The results will then include a TotalEmps column, which will provide the total number of employees who share the same title as the person listed in that row. The following table provides a partial list of employees, their job titles, and the number of people who share that title.

FullName	JobTitle	TotalEmps
Ken SÃ¡nchez	Chief Executive Officer	1
Roberto Tamburello	Engineering Manager	1
Rob Walters	Senior Tool Designer	2
Gail Erickson	Design Engineer	3
Jossef Goldberg	Design Engineer	3
Ovidiu Cracium	Senior Tool Designer	2
Thierry D’Hers	Tool Designer	2
Janice Galvin	Tool Designer	2
Michael Sullivan	Senior Design Engineer	1
Sharon Salavaria	Design Engineer	3
David Bradley	Marketing Manager	1
Kevin Brown	Marketing Assistant	3
John Wood	Marketing Specialist	5
Mary Dempsey	Marketing Assistant	3
Wanida Benshoof	Marketing Assistant	3

The point to all this is that we were able to encapsulate the logic of the view and do a lot more. One issue with this approach, however, is that we have two similar sets of code that would both need to be updated if the schema changed. If we want to avoid this scenario (and simplify our function in the process), we can call the view from within the function definition, as shown in the following example:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

CREATE FUNCTION dbo.GetJobData (@title NVARCHAR(25))

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT TotalEmps FROM dbo.JobData

WHERE JobTitle = @title

);

This function achieves the results as the preceding example, but instead calls the view. That way, if the view changes in a way that does not affect the function’s SELECT statement, the function will not need to be updated, and we’ve simplified the function’s code in the process. In that sense, we have truly parameterized the view.

“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”

Yes, it is possible and fairly easy to do. When defining the parameter, include the default value, along with the equal sign, as shown the following example (highlighted):

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

CREATE FUNCTION dbo.GetPersonTypeCount

(@type NCHAR(2) = 'IN')

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT COUNT(*) AS PersonCount

FROM Person.Person

WHERE PersonType = @type

GROUP BY PersonType

);

In this case, we’ve merely specified that the default value for the @type parameter is IN. However, if we want to use the default value when calling the view, we must specify the default keyword as the parameter value, as shown in the following example:

1	SELECT * FROM dbo.GetPersonTypeCount(default);

The function will now use the IN default value and return a value of 18484. That said, even though we’ve defined a default value, we can still specify that value when calling the function:

1	SELECT * FROM dbo.GetPersonTypeCount('in');

Or we can specify a different value, as we would if no default had been specified:

1	SELECT * FROM dbo.GetPersonTypeCount('em');

This time around, our SELECT statement returns a value of 273.

“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”

Yes. Get rid of the function. Scalar functions are the bane of DBAs everywhere. The database engine has a habit of executing a scalar function for each row of data returned by the query. Even if a function can return only a dozen possible values, it might still run millions of times. You would think the query optimizer would be smarter than this. It is not.

If your query returns relatively few rows, a scalar function is usually no big deal and you can enjoy the encapsulation, parameterization, and ease of implementation that the function affords. But turn those few rows into millions and you’re suddenly faced with an abundance of extra executions, which translate into unnecessary disk I/O as well as hits on memory and processing resources. Plus, you can incur extra locking, which can further impact concurrency and performance.

To complicate matters, the estimated and actual execution plans that your queries generate are far from reliable. The true cost of the execution is often buried within the plan properties. You might even have to look to profile traces to get a more accurate picture of what the function is doing.

When possible, consider turning your scalar function into an inline table valued function. Such a function returns a table, rather than a single value (as is the case with a scalar function). However, you can create a table-valued function that returns only a single column with a single row, giving you results comparable to the scalar function. Best of all, the database engine normally executes the table-value function only once, regardless of the number of rows returned by the query, resulting in far better performance.

Assuming you can turn your scalar function into a table-value one, you would then incorporate the function into your query’s FROM clause when retrieving the data, rather than in the SELECT list. For example, suppose we were to start with a basic scalar function similar to the following:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH SCHEMABINDING AS

BEGIN

DECLARE @TotalItems INT

SELECT @TotalItems = SUM(OrderQty)

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

RETURN @TotalItems;

END;

The function simply returns the total number of items associated with a sale in the SalesOrderDetail table, based on the inputted SalesOrderID value. You can then use a SELECT statement to run the function, in this case, entering a SalesOrderID value of 43659:

1	SELECT dbo.fnGetTotalItems(43659);

Based on the data in the sample database, the function returns a value of 26. If you were to use the function in a slightly more complex query, it might look something like the following:

SELECT SalesOrderID, OrderDate, SalesPersonID,

dbo.fnGetTotalItems(SalesOrderID) AS TotalItems

FROM Sales.SalesOrderHeader

ORDER BY SalesOrderID;

This time around, we’re incorporating the function into the SELECT list of a query retrieving data from the SalesOrderHeader table. The following table shows a partial list of results returned by that query.

SalesOrderID	OrderDate	SalesPersonID	TotalItems
43659	2005-07-01 00:00:00.000	279	26
43660	2005-07-01 00:00:00.000	279	2
43661	2005-07-01 00:00:00.000	282	38
43662	2005-07-01 00:00:00.000	282	54
43663	2005-07-01 00:00:00.000	276	1
43664	2005-07-01 00:00:00.000	280	14
43665	2005-07-01 00:00:00.000	283	20
43666	2005-07-01 00:00:00.000	276	7
43667	2005-07-01 00:00:00.000	277	6
43668	2005-07-01 00:00:00.000	282	93
43669	2005-07-01 00:00:00.000	283	1
43670	2005-07-01 00:00:00.000	275	6
43671	2005-07-01 00:00:00.000	283	17
43672	2005-07-01 00:00:00.000	282	9
43673	2005-07-01 00:00:00.000	275	20

For each row returned, a column has been added that uses the fnGetTotalItems function to calculate the total number of items sold for each order. As a result, the database engine calls the function for each row in the resultset. If we were returning millions of rows-or even billions!-all our operations could be impacted. For that reason, it’s often worth rewriting the function as a table-valued function that returns only one value:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT SUM(OrderQty) AS TotalItems

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

);

The function again takes a single argument, the SalesOrderID value, but this time returns the results as a table. That means, when we call the function, we must do so in a place in the query that accepts table expressions. For that, we can use the APPLY operator to join the SalesOrderHeader table to the function in the FROM clause:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

f.TotalItems

FROM Sales.SalesOrderHeader s

CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

The statement returns the same results as the previous SELECT statement; only this time, the database engine usually calls the function only once and then applies it to each row.

You might not always be able to easily turn your scalar function into a table-value function, in which case, you’ll probably want to take another approach. You might, for example, create a stored procedure or simply create the necessary T-SQL, without encapsulating any of the logic. Whatever approach you take, you should test and compare your queries under realistic workloads and then determine which approach is best suited to your circumstances.

“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that let’s me use the GETDATE function?”

First off, despite the plethora of articles and blog posts that state you cannot use a nondeterministic function such as GETDATE within a user-defined function, it is simply not true, at least not in SQL Server 2008 R2 and SQL Server 2012, and from what I can tell, this has been the case since SQL Server 2005. For proof, check out the TechNet article “User-Defined Functions,” or try it out yourself. You’ll find that some nondeterministic functions, such as GETDATE, can indeed be called from within a user-defined function.

But first a step back to explain what we mean to deterministic and nondeterministic functions. A deterministic function is one that always returns the same results when given the same specific set of input values. For example, the SQRT system function will always return the same square root value of the inputted number if that number is always the same.

However, a nondeterministic function will not necessarily return the same results each time it runs, even if input values are the same. For example, the GETDATE, HOST_ID and NEWID system functions might return different results each time they’re called; therefore, they’re considered nondeterministic functions.

For quite a few years, the conventional wisdom has been that SQL Server does not let you call a nondeterministic function from within a user-defined function. However, the following CREATE FUNCTION statement will run with no problem:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetJobData', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetJobData;

CREATE FUNCTION dbo.GetJobData

(@title NVARCHAR(25), @HireDate DATETIME)

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT COUNT(*) AS TotalEmps

FROM HumanResources.Employee

WHERE JobTitle = @title

AND HireDate BETWEEN @HireDate AND

DATEADD(MM, -6, GETDATE())

GROUP BY JobTitle

);

The function determines the number of employees with the specified job title who have been hired between the specified date and current date, less six months. The function uses the GETDATE function to determine that six-month window.

Despite what you might have heard, SQL Server will create the function. Even when I included the HOST_ID function within the definition (just to test things out), SQL Server created the function. However, when I tried to incorporate the NEWID, RAND, TEXTPTR or NEWSEQUENTIALID nondeterministic functions into the function definition, I received an error message.

It turns out that SQL Server supports only some nondeterministic functions, such as GETDATE and HOST_ID. However, to be sure that GETDATE works correctly in our function, we can test it by calling the function and passing in the necessary parameter values:

1	SELECT * FROM dbo.GetJobData('Buyer', '2004-01-01');

In this case, the function returns only seven employees with the title of Buyer who have been hired within the specified date range. We can test the function further by using the APPLY operator to join the Employee and Person tables to the function, as in the following example:

SELECT

p.FirstName + ' ' + p.LastName AS FullName,

e.JobTitle,

f.TotalEmps

FROM

HumanResources.Employee e

INNER JOIN Person.Person p

ON e.BusinessEntityID = p.BusinessEntityID

CROSS APPLY dbo.GetJobData(e.JobTitle, e.HireDate) f

ORDER BY JobTitle;

For each row returned from the joined tables, the function is applied, based on the JobTitle and HireDate values. The following table shows a partial list of results returned by the query.

FullName	JobTitle	TotalEmps
Barbara Moreland	Accountant	2
Mike Seamans	Accountant	1
David Liu	Accounts Manager	1
Karen Berg	Application Specialist	1
Ramesh Meyyappan	Application Specialist	2
Dan Bacon	Application Specialist	3
Janaina Bueno	Application Specialist	4
Mindy Martin	Benefits Specialist	1
Mikael Sandberg	Buyer	9
Arvind Rao	Buyer	8
Linda Meisner	Buyer	7
Fukiko Ogisu	Buyer	6
Gordon Hee	Buyer	5
Frank Pellow	Buyer	4
Eric Kurjan	Buyer	3
Erin Hagens	Buyer	2
Ben Miller	Buyer	1
Ken SÃ¡nchez	Chief Executive Officer	1
Laura Norman	Chief Financial Officer	1

Notice how, for the Buyer job title, the number of total employees decreases with each row. This is because the date range is shrinking with each row, as the hire date is compared to the GETDATE value (minus the six months). The point is, the GETDATE function, a built-in nondeterministic function, works fine in our user-defined function. So don’t believe everything you read.

In cases when you want to use a nondeterministic function that is not permitted, such as NEWID, RAND, TEXTPTR or NEWSEQUENTIALID, you can get around this limitation by calling the system function within a view and then calling the view from within your user-defined function. Or instead consider creating a stored procedure that achieves what your after or just going with straight T-SQL and not encapsulating the logic.

“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”

The reason you’re receiving errors is because, technically, you cannot call a stored procedure from within a user-defined function, unless it is an extended stored procedure. But even there, time is running out. Extended stored procedures have been deprecated and will eventually disappear from the SQL Server landscape.

That said, you’ll find plenty of online articles and blog posts suggesting methods for bypassing SQL Server’s built-in limitation against calling stored procedures within a function. One suggested workaround is to use an xp_cmdshell command to run a batch file that executes the stored procedure. Another suggestion is to use the OPENQUERY built-in function to connect via a linked server and then call the stored procedure.

In either case, you’re essentially creating a second process to trick SQL Server into running the stored procedure. The problem with such an approach is that you can end up tricking other SQL Server components as well, and end up in a deadlock that SQL Server cannot resolve.

SQL Server functions are specifically designed to prevent them from being able to change the state of the database in any way. Because stored procedures can make such changes, you cannot run them from within a function, just like you cannot modify the schema or the stored data. Functions essentially read data and then usually, in some way, manipulate the read data. But a function should never impact the underlying schema or data. That means you cannot create a function that inadvertently corrupts your entire database. If you need a routine to run a batch that includes stored procedures, or that modifies data or metadata, you need a stored procedure.

“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”

My guess is that you’re referencing a view or user-defined function within your function that is itself not schema bound. You cannot schema bind a function if the referenced object isn’t bound.

Schema binding ensures that the function you’re creating is bound to the database objects that it references. When you specify the SCHEMABINDING option, no one can modify the base object in a way that would affect the function definition. The function must first be modified or dropped to remove any dependencies before the underlying objects can be changed.

However, when your function definition references a view or other user-defined function, that object must also be schema bound before you can apply the SCHEMABINDING function to your new function, otherwise, you’re not really protecting the new function from underlying schema changes.

Let’s look at an example to get a sense of how this works. Suppose we create a basic table-valued function similar to the one shown in the following example:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT COUNT(*) AS PersonCount

FROM Person.Person

WHERE PersonType = @type

GROUP BY PersonType

);

Notice that the function’s SELECT statement references only the Person table and its columns. You can include the SCHEMABINDING option in the WITH clause with no problem because you’re directly referencing the database objects (the table and columns). If we were to call the function, as in the following example, it would return the number of people associated with a specific type, in this case, in.

1	SELECT * FROM dbo.GetPersonTypeCount('in');

The SELECT statement returns a value of 18484. We can achieve the same results be creating the following view and then qualifying our SELECT statement when we call the view:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL

DROP VIEW dbo.PersonTypeCount;

CREATE VIEW dbo.PersonTypeCount

SELECT PersonType, COUNT(*) AS PersonCount

FROM Person.Person

GROUP BY PersonType;

To return the same results as the preceding SELECT statement, we must include a WHERE clause to specify the person type, as shown in the following example:

1 2	SELECT PersonCount FROM dbo.PersonTypeCount WHERE PersonType = 'in';

Suppose we now want to update our function to reference the view, rather than include its own object. This allows us to pass in a parameter without having to define a WHERE clause each time we call the view. The function does it for us, as shown in the following definition:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

RETURN

(

SELECT PersonCount FROM dbo.PersonTypeCount

WHERE PersonType = @type

);

Notice in this function definition we do not include the WITH SCHEMABINDING clause, as we did the first time we created the function. If we include it, we’ll receive an error message saying that we cannot schema bind the function. We would either have to remove the clause or re-create the view to include the WITH SCHEMABINDING clause, as is the following example:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.PersonTypeCount', N'V') IS NOT NULL

DROP VIEW dbo.PersonTypeCount;

CREATE VIEW dbo.PersonTypeCount

WITH SCHEMABINDING

SELECT PersonType, COUNT(*) AS PersonCount

FROM Person.Person

GROUP BY PersonType;

After we re-create the view, we can then run our function definition again, but this time with the WITH SCHEMABINDING clause:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.GetPersonTypeCount', N'IF') IS NOT NULL

DROP FUNCTION dbo.GetPersonTypeCount;

CREATE FUNCTION dbo.GetPersonTypeCount (@type NCHAR(2))

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT PersonCount FROM dbo.PersonTypeCount

WHERE PersonType = @type

);

Because we made the view schema bound, we can now do the same with the function. As a result, no schema changes can be made that will affect either the view or the function.

“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”

The ways in which you can call a user-defined function depends on whether it is a scalar function or a table-valued function. For example, suppose we have the following scalar function:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.fnGetTotalItems', N'FN') IS NOT NULL

DROP FUNCTION dbo.fnGetTotalItems;

CREATE FUNCTION dbo.fnGetTotalItems (@OrderID INT)

RETURNS INT

WITH SCHEMABINDING AS

BEGIN

DECLARE @TotalItems INT

SELECT @TotalItems = SUM(OrderQty)

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

RETURN @TotalItems;

END;

The function takes an input parameter based on the SalesOrderID column in the SalesOrderDetail table and returns a total number of items associated with that sale. You can call the function within a column expression in your query’s SELECT, WHERE, or HAVING clause. For example, that query can be something as simple as the following:

1	SELECT dbo.fnGetTotalItems(43659);

In this case, the query returns a value of 26, which means order 43659 has 26 items associated with it. The column expression in this case is simply the function and its parameter value. However, we can create a query slightly more complex:

SELECT SalesOrderID, OrderDate, SalesPersonID,

dbo.fnGetTotalItems(SalesOrderID) AS TotalItems

FROM Sales.SalesOrderHeader

ORDER BY SalesOrderID;

Once again, the function is included as part of a column expression in the SELECT list. Only this time, we’ve assigned an alias to the column and specified the SalesOrderID column as the parameter’s value. The following table provides a partial list of the values returned by this statement.

SalesOrderID	OrderDate	SalesPersonID	TotalItems
43659	2005-07-01 00:00:00.000	279	26
43660	2005-07-01 00:00:00.000	279	2
43661	2005-07-01 00:00:00.000	282	38
43662	2005-07-01 00:00:00.000	282	54
43663	2005-07-01 00:00:00.000	276	1
43664	2005-07-01 00:00:00.000	280	14
43665	2005-07-01 00:00:00.000	283	20
43666	2005-07-01 00:00:00.000	276	7
43667	2005-07-01 00:00:00.000	277	6
43668	2005-07-01 00:00:00.000	282	93
43669	2005-07-01 00:00:00.000	283	1
43670	2005-07-01 00:00:00.000	275	6
43671	2005-07-01 00:00:00.000	283	17
43672	2005-07-01 00:00:00.000	282	9
43673	2005-07-01 00:00:00.000	275	20

Despite the differences between the preceding two SELECT statements, they’re essentially doing the same thing, in terms of calling the function as part of a column expression. However, SQL Server also lets you call a scalar function within an EXECUTE statement, as shown in the following example:

DECLARE @OrderQty INT;

SET @OrderQty = NULL;

EXEC @OrderQty = dbo.fnGetTotalItems @OrderID = 43659;

SELECT @OrderQty;

Like our first SELECT statement, the T-SQL here returns a value of 26; however, the way we get there is a bit more complicated in this example. First we declare the @OrderQty variable, set its value to NULL, and then use the EXECUTE statement to call the function and assign the returned value to the variable. Finally, we use a SELECT statement to retrieve the variable value.

Not surprisingly, the ways in which we can call a table-valued function are different because the function returns a full resultset (table). That means we must treat the function similar to how we would treat a table or view in our queries. Let’s look at an example to see how this works. The following T-SQL creates a table-valued function that contains logic similar to our previous example:

USE AdventureWorks2012;

IF OBJECT_ID(N'dbo.ifGetTotalItems', N'IF') IS NOT NULL

DROP FUNCTION dbo.ifGetTotalItems;

CREATE FUNCTION dbo.ifGetTotalItems (@OrderID INT)

RETURNS TABLE

WITH SCHEMABINDING

RETURN

(

SELECT SUM(OrderQty) AS TotalItems

FROM Sales.SalesOrderDetail

WHERE SalesOrderID = @OrderID

GROUP BY SalesOrderID

);

When calling the function, we must remember that we’re dealing with a table-like resultset. The simplest way to do this is to include the function in our query’s FROM clause, where a table expression is expected:

1	SELECT TotalItems FROM dbo.ifGetTotalItems(43659);

Once again, our SELECT statement returns a value of 26. However, if the function had returned more rows and more columns, those would be included as well. In this case, the SELECT statement returns whatever results the function generates.

When working with table-valued functions, you can also use the APPLY operator to join a table to the function, as shown in the following example:

SELECT s.SalesOrderID, s.OrderDate, s.SalesPersonID,

f.TotalItems

FROM Sales.SalesOrderHeader s

CROSS APPLY dbo.ifGetTotalItems(s.SalesOrderID) f

ORDER BY SalesOrderID;

This time around, we’re calling the function in the FROM clause (by joining it to the table) and including the function’s returned value (TotalItems) in the SELECT list. The SELECT statement will then return the same results shown with the preceding scalar function example. Be aware, however; you cannot use an EXECUTE statement to call a table-valued function. That is reserved for scalar functions only.

“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

Based on what you’re describing, a user-defined function is likely the way to go. SQL Server lets you call a function from within a SELECT statement, which you cannot do with a stored procedure. However, both provide a mechanism for encapsulating your T-SQL and passing in parameters. With stored procedures, though, you can define both input and output parameters. Functions are limited to input parameters. In addition, a function must return a single result, either a scalar value or a table. Stored procedures can return a single result, multiple results, or no results.

Another difference is that you can include all sorts of T-SQL statements within a stored procedure, which means you can retrieve data, modify data, create tables, delete tables, or take a variety of other actions. In a user-defined function, you’re basically limited to retrieving data, along with creating and managing local variables and calling extended stored procedures (which have been deprecated and will one day disappear altogether). Basically, you cannot take any action in a function that would modify the database state.

Also worth nothing is that you can call a function from within a stored procedure, but not the other way around (at least not without some clunky and often risky workarounds). Stored Procedures also let you incorporate TRY...CATCH error handling into your code. Functions do not.

It’s not all bad news for functions, however. One of their biggest advantages, as noted above, is the ability to call them from within your SELECT statements. Not only can you use functions in the SELECT list, but also within your WHERE and HAVING clauses. In addition, you can join a table-valued function in your FROM clause to one or more tables, usually with the help of the APPLY operator. Plus, you can use functions in your computed column and CHECK constraint definitions. Try doing that with a stored procedure.

Register for Simple Talk

TSQL User-Defined Functions: Ten Questions You Were Too Shy To Ask

The Questions

“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”

“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”

“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”

“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”

“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”

“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that let’s me use the GETDATE function?”

“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”

“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”

“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”

“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

Further Reading

Article tags

About the author

Robert Sheldon

Robert's contributions

Articles

Books

Top topics

Robert's latest contributions:

Importing and exporting data in MongoDB Compass

Creating Time Series Collections in MongoDB

Working with MongoDB Indexes

The Questions

“I’m creating a scalar function that will be used against a column that contains multiple NULL values. Is there a way to stop the function from being executed when a NULL is passed in as a parameter value?”

“I created a table-valued function that I want to apply to a column in each row returned by the query. However, when I try to join the function to the target table in the query’s FROM clause, I receive an error. Is there an effective way to invoke the function for each row in the result set?”

“I’ve heard you can use a function to parameterize a view, but I don’t see how you can incorporate a function into a view definition in such a way to support parameters. Can you explain how that is done?”

“I’m creating a user-defined function and want to specify a default value for an input parameter. Is that possible?”

“I created a scalar function that’s used in multiple queries, some of which can return millions of rows. Ever since I incorporated the function into the queries, performance has nosedived. Are there steps we can take to fix this?”

“I want to call the GETDATE system function from within a user-defined function, but I’ve read you cannot do this. Is there a workaround that let’s me use the GETDATE function?”

“I want to execute a stored procedure from within a user-defined function, but I keep running into errors in the syntax. How do I call a stored procedure from within a function?”

“When creating user-defined functions, I like to include the SCHEMABINDING option, but in some cases, I receive an error when creating the function, saying that I cannot schema bind the function. Any idea what’s going on?”

“I’ve seen user-defined functions called in different ways in code. Is there a proper way to call a function?”

“I’m working on a query whose T-SQL code I want to encapsulate and parameterize. The query will aggregate a subset of values based on an input parameter and return the aggregated value as a column in the query results. Should I create a stored procedure or a user-defined function?”

Further Reading

Article tags

Recommended

About the author

Robert Sheldon

Robert's contributions

Articles

Books

Top topics

Robert's latest contributions:

Importing and exporting data in MongoDB Compass

Creating Time Series Collections in MongoDB

Working with MongoDB Indexes